L1 Regularized Regression for Reranking and System Combination in Machine Translation

نویسندگان

  • Ergun Biçici
  • Deniz Yuret
چکیده

We use L1 regularized transductive regression to learn mappings between source and target features of the training sets derived for each test sentence and use these mappings to rerank translation outputs. We compare the effectiveness of L1 regularization techniques for regression to learn mappings between features given in a sparse feature matrix. The results show the effectiveness of using L1 regularization versus L2 used in ridge regression. We show that regression mapping is effective in reranking translation outputs and in selecting the best system combinations with encouraging results on different language pairs.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Adaptive Model Weighting and Transductive Regression for Predicting Best System Combinations

We analyze adaptive model weighting techniques for reranking using instance scores obtained by L1 regularized transductive regression. Competitive statistical machine translation is an on-line learning technique for sequential translation tasks where we try to select the best among competing statistical machine translators. The competitive predictor assigns a probability per model weighted by t...

متن کامل

RegMT System for Machine Translation, System Combination, and Evaluation

We present the results we obtain using our RegMT system, which uses transductive regression techniques to learn mappings between source and target features of given parallel corpora and use these mappings to generate machine translation outputs. Our training instance selection methods perform feature decay for proper selection of training instances, which plays an important role to learn correc...

متن کامل

A Voted Regularized Dual Averaging Method for Large-Scale Discriminative Training in Natural Language Processing

We propose a new algorithm based on the dual averaging method for large-scale discriminative training in natural language processing (NLP), as an alternative to the perceptron algorithms or stochastic gradient descent (SGD). The new algorithm estimates parameters of linear models by minimizing L1 regularized objectives and are effective in obtaining sparse solutions, which is particularly desir...

متن کامل

The Regression Model of Machine Translation

Machine translation is the task of automatically nding the translation of a source sentence in the target language. Statistical machine translation (SMT) use parallel corpora or bilingual paired corpora that are known to be translations of each other to nd a likely translation for a given source sentence based on the observed translations. The task of machine translation can be seen as an insta...

متن کامل

The RWTH System Combination System for WMT 2010

RWTH participated in the System Combination task of the Fifth Workshop on Statistical Machine Translation (WMT 2010). For 7 of the 8 language pairs, we combine 5 to 13 systems into a single consensus translation, using additional n-best reranking techniques in two of these language pairs. Depending on the language pair, improvements versus the best single system are in the range of +0.5 and +1....

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010